Prediction and Inference

Dr Kristy Robledo

The University of Sydney

Inference (explaining) versus Prediction (future)


Explaining

  • collect data from given sample
  • apply models to test hypotheses (explain relationships)

Prediction

  • apply a model to data
  • predicting new or future observations
Explanatory versus predictive modelling

What is prediction modelling?

Diagnostic vs Prognostic models



Diagnostic
  • Predicts the presence or absence of a disease or condition
  • e.g. breast cancer detection

Prognostic
  • Prediction of a disease course (including treatment)
  • e.g. probability of CVD events for a patient with diabetes

Pipeline for translation of a clinical trial into practice

Individual absolute risk (current treatment), from a risk prediction model

×

Relative risk reduction (new treatment), from a predictive biomarker (clinical trial)

=

Personalised absolute risk reduction (size of benefit)



Decision for practice and policy
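
The multiplication in this pipeline can be illustrated with a minimal sketch. The function name `personalised_arr` and the example risks are hypothetical, chosen only for illustration; they are not taken from any trial.

```python
def personalised_arr(baseline_risk, relative_risk_reduction):
    """Personalised absolute risk reduction = absolute risk x relative risk reduction.

    baseline_risk: individual absolute risk on current treatment (0 to 1),
                   e.g. from a risk prediction model.
    relative_risk_reduction: relative risk reduction with the new treatment (0 to 1),
                   e.g. estimated in a clinical trial.
    """
    if not 0.0 <= baseline_risk <= 1.0:
        raise ValueError("baseline_risk must be between 0 and 1")
    if not 0.0 <= relative_risk_reduction <= 1.0:
        raise ValueError("relative_risk_reduction must be between 0 and 1")
    return baseline_risk * relative_risk_reduction


# Hypothetical patients: the same relative risk reduction gives a larger
# absolute benefit to the patient with the higher baseline risk.
low_risk_benefit = personalised_arr(0.10, 0.40)   # about 0.04
high_risk_benefit = personalised_arr(0.40, 0.40)  # about 0.16
```

The same relative effect therefore translates into different sizes of benefit, which is why the individual absolute risk is needed before a decision for practice or policy.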

Methods for developing predictive models

Prediction example: T4DM risk score model

Development cohort: T4DM trial

  • males aged 50 to 74 years,
  • waist circumference ≥95 cm,
  • impaired glucose tolerance or newly diagnosed type 2 diabetes,
  • fasting testosterone ≤14 nmol/L


Primary outcome: Diabetes at two years, as measured by 2-hour glucose by OGTT ≥11.1 mmol/L

Step one: reduce 35 candidate risk factors to 16

Step two: select from 16 risk factors + fit model

  • LASSO penalisation using 469 participants with complete data
  • 10-fold cross-validation was used to maximise AUC
  • the dotted line in the plot denotes the maximum AUC, where two covariates with non-zero coefficients were selected in the model (HbA1c and 2-hour glucose)
  • refit without penalisation, with AUC 0.809 (n = 665 patients)
  • including testosterone (T) treatment, AUC 0.816
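
The AUC used as the selection criterion above can be read as the probability that a randomly chosen participant with the outcome has a higher predicted risk than a randomly chosen participant without it. A minimal sketch of that calculation follows; the function name `auc` and the example values are illustrative assumptions, not the software or data used for the T4DM model.

```python
def auc(risks, outcomes):
    """Area under the ROC curve via pairwise concordance.

    risks: predicted risks, one per participant.
    outcomes: 1 if the participant had the outcome, 0 otherwise.
    Ties in predicted risk count as half-concordant.
    """
    cases = [r for r, y in zip(risks, outcomes) if y == 1]
    controls = [r for r, y in zip(risks, outcomes) if y == 0]
    if not cases or not controls:
        raise ValueError("need at least one case and one control")
    concordant = 0.0
    for case_risk in cases:
        for control_risk in controls:
            if case_risk > control_risk:
                concordant += 1.0
            elif case_risk == control_risk:
                concordant += 0.5
    return concordant / (len(cases) * len(controls))


# Hypothetical predictions: 3 of the 4 case-control pairs are ranked correctly.
example_auc = auc([0.10, 0.40, 0.35, 0.80], [0, 0, 1, 1])  # 0.75
```

In cross-validation, this quantity would be computed on each held-out fold and averaged for every candidate penalty, with the penalty giving the highest average AUC retained.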

Step three: Validation

  • EXTEND45 cohort of 267,357 participants aged 45 and up
  • conduct approved by the University of New South Wales HREC
  • baseline questionnaires collected Jan 2006 - Dec 2009
  • lab data linked up to July 2013

Step four: Model performance (Discrimination)

Step four: Model performance (Calibration)
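
Calibration is commonly checked by comparing the mean predicted risk with the observed outcome proportion within groups of participants ordered by predicted risk. The sketch below assumes this grouped approach; the function name `calibration_table` and the data are hypothetical and do not reproduce the T4DM validation.

```python
def calibration_table(risks, outcomes, n_groups):
    """Return (mean predicted risk, observed proportion) for each risk group.

    Participants are sorted by predicted risk and split into n_groups
    groups of (nearly) equal size.
    """
    if n_groups < 1 or n_groups > len(risks):
        raise ValueError("n_groups must be between 1 and the number of participants")
    pairs = sorted(zip(risks, outcomes))
    rows = []
    for g in range(n_groups):
        start = g * len(pairs) // n_groups
        end = (g + 1) * len(pairs) // n_groups
        group = pairs[start:end]
        mean_predicted = sum(r for r, _ in group) / len(group)
        observed = sum(y for _, y in group) / len(group)
        rows.append((mean_predicted, observed))
    return rows


# Hypothetical data in which predicted and observed risks agree in both groups.
rows = calibration_table(
    [0.1, 0.2, 0.3, 0.4, 0.6, 0.7, 0.8, 0.9],
    [0, 0, 0, 1, 0, 1, 1, 1],
    n_groups=2,
)
```

A well-calibrated model gives pairs that lie close to the line where predicted equals observed; systematic gaps suggest the model over- or under-predicts in the validation cohort.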

Step five: Recalibration
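
One common form of recalibration is an intercept update: every prediction is shifted by the same amount on the log-odds scale so that the mean predicted risk matches the outcome rate observed in the new cohort. The slides do not state which recalibration method was applied, so the sketch below is an assumption shown for illustration only, with hypothetical function names and data.

```python
import math


def logit(p):
    return math.log(p / (1.0 - p))


def expit(x):
    return 1.0 / (1.0 + math.exp(-x))


def intercept_shift(risks, observed_rate, lo=-10.0, hi=10.0, tol=1e-10):
    """Find the log-odds shift that makes the mean predicted risk equal observed_rate.

    Uses bisection; the mean shifted risk increases monotonically with the shift.
    All risks and observed_rate must lie strictly between 0 and 1.
    """
    def mean_shifted_risk(delta):
        return sum(expit(logit(p) + delta) for p in risks) / len(risks)

    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        if mean_shifted_risk(mid) < observed_rate:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0


def recalibrate(risks, delta):
    """Apply the same log-odds shift to every predicted risk."""
    return [expit(logit(p) + delta) for p in risks]


# Hypothetical example: the model predicts 40% on average, but only 25%
# of the new cohort had the outcome, so risks are shifted downwards.
original = [0.2, 0.4, 0.6]
delta = intercept_shift(original, observed_rate=0.25)
updated = recalibrate(original, delta)
```

Because every participant receives the same shift, the ranking of predicted risks (and therefore the AUC) is unchanged; only the overall level of the predictions moves.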

Inference example: T4DM Mediation analysis

Mediation analysis

  • tool to disentangle potential causal pathways in data from clinical trials
  • T4DM found a 40% reduction in diabetes with testosterone treatment
  • also changes in body composition with T treatment (decreased fat mass, increased muscle mass)
  • was it the direct effect of testosterone treatment? Or the testosterone-induced changes in body composition?

https://academic.oup.com/ejendo/article/189/1/50/7219871
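
The question of direct versus mediated effects can be sketched with the simplest version of mediation analysis, the difference method with linear models: the total effect of treatment is split into a direct effect (adjusted for the mediator) and an indirect effect (the remainder). This is a simplified illustration on invented, noise-free toy data with a continuous outcome; it is not the method or the data of the T4DM mediation analysis, and all names and numbers are hypothetical.

```python
def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def ols(columns, y):
    """Least-squares coefficients (intercept first) for the given predictor columns."""
    design = [[1.0] + [col[i] for col in columns] for i in range(len(y))]
    k = len(design[0])
    xtx = [[sum(row[i] * row[j] for row in design) for j in range(k)] for i in range(k)]
    xty = [sum(row[i] * yi for row, yi in zip(design, y)) for i in range(k)]
    return solve_linear(xtx, xty)


# Invented toy data (NOT T4DM data): treatment lowers the mediator by 2 units,
# and the outcome depends on both treatment (-0.5) and the mediator (+0.25).
treatment = [0, 0, 0, 0, 1, 1, 1, 1]
variation = [0.0, 1.0, 2.0, 3.0] * 2
mediator = [-2.0 * t + u for t, u in zip(treatment, variation)]
outcome = [10.0 - 0.5 * t + 0.25 * m for t, m in zip(treatment, mediator)]

total_effect = ols([treatment], outcome)[1]             # outcome ~ treatment
direct_effect = ols([treatment, mediator], outcome)[1]  # outcome ~ treatment + mediator
indirect_effect = total_effect - direct_effect
proportion_mediated = indirect_effect / total_effect
```

In this toy example half of the total effect runs through the mediator. With a binary outcome such as diabetes, and with possible confounding between mediator and outcome, the decomposition requires more careful causal mediation methods than this linear sketch.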

The causal pathway for T4DM

Thank you!